Graph-Based Collective Lexical Selection for Statistical Machine Translation
نویسندگان
چکیده
Lexical selection is of great importance to statistical machine translation. In this paper, we propose a graph-based framework for collective lexical selection. The framework is established on a translation graph that captures not only local associations between source-side content words and their target translations but also targetside global dependencies in terms of relatedness among target items. We also introduce a random walk style algorithm to collectively identify translations of sourceside content words that are strongly related in translation graph. We validate the effectiveness of our lexical selection framework on Chinese-English translation. Experiment results with large-scale training data show that our approach significantly improves lexical selection.
منابع مشابه
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
متن کاملThe Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
متن کاملStatistical Machine Translation through Global Lexical Selection and Sentence Reconstruction
Machine translation of a source language sentence involves selecting appropriate target language words and ordering the selected words to form a well-formed target language sentence. Most of the previous work on statistical machine translation relies on (local) associations of target words/phrases with source words/phrases for lexical selection. In contrast, in this paper, we present a novel ap...
متن کاملEvaluation of EuroWordNet- and LCS-Based Lexical Resources for Machine Translation
We evaluate two types of lexical resources with respect to their applicability to interlingual machine translation: (1) a EuroWordNetbased database of bilingual links between Spanish and English words; and (2) a repository of semantically classified verbs with their corresponding Lexical Conceptual Structure (LCS) representations. We examine the utility of these two resources for the task of le...
متن کاملThe Effect of Lexicon-based Debates on the Felicity of Lexical Equivalents in Translating Literary Texts by Iranian EFL Learners
This study was an attempt to investigate the effect of lexicon-based debates on the felicity of lexical equivalents in translating literary texts by Iranian EFL learners. To fulfill the purpose of this study, 59 university students, majoring in English Translation, were randomly assigned to the experimental and control groups from a total of 73 students based on their performance on a mock TOE...
متن کامل